Quality-Aware Integration and Warehousing of Genomic Data
Abstract
In human health and the life sciences, researchers collaborate extensively, sharing biomedical and genomic data along with their experimental results. This requires either dynamically integrating different databases or warehousing them in a single repository. We draw on our experience building GEDAW (Gene Expression Data Warehouse), a data warehouse that stores data on genes expressed in the liver during iron overload and liver pathologies, together with relevant information from public databanks (mostly in XML format), in-house DNA chip experiments, and medical records. From this experience, we present the lessons learned, the data quality issues that arise in this context, and the solutions we currently propose for integrating and warehousing biomedical data. This paper provides a functional and modular architecture for data quality enhancement and awareness in the complex processes of integrating and warehousing biomedical data.
Related articles
Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...
Key Issues in Achieving Data Quality and Consistency in Data Warehousing among Large Organizations in Australia
This paper discusses the emergent key issues of data quality in a data warehousing environment. The research study leading to our outcome is described. We investigate the relationship between data quality and data consistency; determine whether data inconsistencies are present in data warehouses and explore how organisations ensure, plan and maintain data quality. Our research outcome an improv...
Foundations of Data Warehouse Quality Source Integration in Data Warehousing
Source Integration is one of the core problems in Data Warehousing. Two critical factors for the design and maintenance of applications requiring Source Integration, and in particular Data Warehouse applications, are conceptual modeling of the domain, and reasoning support over the conceptual representation. We present a novel approach to conceptual modeling for Source Integration, which allows...
Database Challenges in the Integration of Biomedical Data Sets
The clinical and basic science research domains present exciting and difficult data integration issues. Solving these problems is crucial as current research efforts in the field of biomedicine heavily depend upon integrated storage, querying, analysis, and visualization of clinicopathology information, genomic annotation, and large scale functional genomic research data sets. Such large scale ...